Smart Paper Notes

Advances in Multi-Variate Analysis Methods for New Physics Searches at the Large Hadron Collider

Anna Stakia , Tommaso Dorigo , Giovanni Banelli , Daniela Bortoletto , Alessandro Casa , Pablo de Castro , Christophe Delaere , Julien Donini , Livio Finos , Michele Gallinaro

Categories: Machine Learning

2021-05-16

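Between 2015 and 2019, members of the Horizon 2020-funded Innovative Training Network "AMVA4NewPhysics" studied the customization and application of advanced multivariate analysis methods and statistical learning tools to high-energy physics problems, and also developed entirely new ones. Many of these methods have been used successfully to improve the sensitivity of data analyses performed by the ATLAS and CMS experiments at the CERN Large Hadron Collider; several others, still in the testing phase, promise to further improve the precision of measurements of fundamental physics parameters and the reach of searches for new phenomena. This paper presents the most relevant new tools among those studied and developed, together with an evaluation of their performance.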


Neural Networks beyond explainability: Selective inference for sequence motifs

Antoine Villié , Philippe Veber , Yohann de Castro , Laurent Jacob

Categories: Machine Learning | Machine Learning (Statistics)

2022-12-23

Over the past decade, neural networks have been successful at making predictions from biological sequences, especially in the context of regulatory genomics. As in other fields of deep learning, tools have been devised to extract features such as sequence motifs that can explain the predictions made by a trained network. Here we intend to go beyond explainable machine learning and introduce SEISM, a selective inference procedure to test the association between these extracted features and the predicted phenotype. In particular, we discuss how training a one-layer convolutional network is formally equivalent to selecting motifs maximizing some association score. We adapt existing sampling-based selective inference procedures by quantizing this selection over an infinite set to a large but finite grid. Finally, we show that sampling under a specific choice of parameters is sufficient to characterize the composite null hypothesis typically used for selective inference, a result that goes well beyond our particular framework. We illustrate the behavior of our method in terms of calibration, power and speed, and discuss its power/speed trade-off with a simpler data-split strategy. SEISM paves the way to an easier analysis of neural networks used in regulatory genomics, and to more powerful methods for genome-wide association studies (GWAS).


Comparison and Evaluation of Methods for a Predict+Optimize Problem in Renewable Energy

Christoph Bergmeir , Frits de Nijs , Abishek Sriramulu , Mahdi Abolghasemi , Richard Bean , John Betts , Quang Bui , Nam Trong Dinh , Nils Einecke , Rasul Esmaeilbeigi

Categories: Artificial Intelligence

2022-12-21

Algorithms that involve both forecasting and optimization are at the core of solutions to many difficult real-world problems, such as in supply chains (inventory optimization), traffic, and in the transition towards carbon-free energy generation in battery/load/production scheduling in sustainable energy systems. Typically, in these scenarios we want to solve an optimization problem that depends on unknown future values, which therefore need to be forecast. As both forecasting and optimization are difficult problems in their own right, relatively little research has been done in this area. This paper presents the findings of the "IEEE-CIS Technical Challenge on Predict+Optimize for Renewable Energy Scheduling," held in 2021. We present a comparison and evaluation of the seven highest-ranked solutions in the competition, to provide researchers with a benchmark problem and to establish the state of the art for this benchmark, with the aim of fostering and facilitating research in this area. The competition used data from the Monash Microgrid, as well as weather data and energy market data. It then focused on two main challenges: forecasting renewable energy production and demand, and obtaining an optimal schedule for the activities (lectures) and on-site batteries that leads to the lowest cost of energy. The most accurate forecasts were obtained by gradient-boosted tree and random forest models, and optimization was mostly performed using mixed integer linear and quadratic programming. The winning method predicted different scenarios and optimized over all scenarios jointly using a sample average approximation method.
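The sample average approximation idea mentioned at the end of this abstract can be illustrated with a minimal sketch. It is not the winning team's implementation: the cost function, the demand distribution, and the grid search (used here in place of mixed integer programming) are all assumptions chosen to keep the example self-contained.

```python
import numpy as np

def saa_decision(candidates, scenarios, cost):
    """Pick the candidate decision with the lowest average cost over all scenarios."""
    avg_costs = [np.mean([cost(c, s) for s in scenarios]) for c in candidates]
    best = int(np.argmin(avg_costs))
    return candidates[best], avg_costs[best]

# Hypothetical cost: energy bought ahead costs 1 per unit,
# any shortfall against realized demand costs 3 per unit.
def cost(quantity, demand):
    return 1.0 * quantity + 3.0 * max(demand - quantity, 0.0)

rng = np.random.default_rng(0)
# Hypothetical forecast scenarios for demand (assumed normal for illustration).
scenarios = rng.normal(10.0, 2.0, size=500)
# Candidate decisions on a coarse grid (a stand-in for a real solver).
candidates = np.linspace(0.0, 20.0, 201)

q_star, avg_cost = saa_decision(candidates, scenarios, cost)
print(q_star, avg_cost)
```

The point of the sketch is that a single decision is evaluated against every sampled scenario at once, rather than being optimized against one point forecast.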


Robust field-level inference with dark matter halos

Helen Shao , Francisco Villaescusa-Navarro , Pablo Villanueva-Domingo , Romain Teyssier , Lehman H. Garrison , Marco Gatti , Derek Inman , Yueying Ni , Ulrich P. Steinwandel , Mihir Kulkarni

Categories: Artificial Intelligence | Machine Learning

2022-09-14

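We train graph neural networks on halo catalogues from Gadget N-body simulations to perform field-level likelihood-free inference of cosmological parameters. The catalogues contain $\lesssim$5,000 halos with masses $\gtrsim 10^{10}~h^{-1}M_\odot$ in a periodic volume of $(25~h^{-1}{\rm Mpc})^3$; every halo in the catalogue is characterized by several properties such as position, mass, velocity, concentration, and maximum circular velocity. Our models, built to be permutationally, translationally, and rotationally invariant, do not impose a minimum scale on which to extract information and are able to infer the values of $\Omega_{\rm m}$ and $\sigma_8$ with a mean relative error of $\sim 6\%$, when using positions plus velocities and positions plus masses, respectively. More importantly, we find that our models are very robust: they can infer the values of $\Omega_{\rm m}$ and $\sigma_8$ when tested on halo catalogues from thousands of N-body simulations run with five different N-body codes: Abacus, CUBEP$^3$M, Enzo, PKDGrav3, and Ramses. Surprisingly, the model trained to infer $\Omega_{\rm m}$ also works when tested on thousands of state-of-the-art CAMELS hydrodynamic simulations run with four different codes and subgrid physics implementations. Using halo properties such as concentration and maximum circular velocity allows our models to extract more information, at the expense of breaking the robustness of the models. This may happen because the different N-body codes are not converged on the relevant scales corresponding to these parameters.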


HammingMesh: A Network Topology for Large-Scale Deep Learning

Torsten Hoefler , Tommaso Bonato , Daniele De Sensi , Salvatore Di Girolamo , Shigang Li , Marco Heddes , Jon Belk , Deepak Goel , Miguel Castro , Steve Scott

Categories: Artificial Intelligence

2022-09-03

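Numerous microarchitectural optimizations unlocked tremendous processing power for deep neural networks, which in turn fueled the AI revolution. With the exhaustion of such optimizations, the growth of modern AI is now gated by the performance of training systems, especially their data movement. Instead of focusing on single accelerators, we investigate the data-movement characteristics of large-scale training at full system scale. Based on our workload analysis, we design HammingMesh, a novel network topology that provides high bandwidth at low cost with high job scheduling flexibility. Specifically, HammingMesh can support full bandwidth and isolation for deep learning training jobs with two dimensions of parallelism. Furthermore, it also supports high global bandwidth for generic traffic. Thus, HammingMesh will power future large-scale deep learning systems with extreme bandwidth requirements.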


BERTIN: Efficient Pre-Training of a Spanish Language Model using Perplexity Sampling

Javier de la Rosa , Eduardo G. Ponferrada , Paulo Villegas , Pablo Gonzalez de Prado Salas , Manu Romero , María Grandury

Categories: Natural Language Processing | Artificial Intelligence

2022-07-14

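The pre-training of large language models usually requires massive amounts of resources, both in terms of computation and data. Frequently used web sources such as Common Crawl might contain enough noise to make this pre-training sub-optimal. In this work, we experiment with different sampling methods on the Spanish version of mC4, and present a novel data-centric technique, which we name $\textit{perplexity sampling}$, that enables the pre-training of language models in roughly half the number of steps and using one fifth of the data. The resulting models are comparable to the current state of the art, and even achieve better results for certain tasks. Our work is proof of the versatility of Transformers, and paves the way for small teams to train their models on a limited budget. Our models are available at this $\href{https://huggingface.co/bertin-project}{URL}$.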
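The abstract does not spell out how perplexity sampling weights documents, so the following is only a rough sketch of one way perplexity-based subsampling could look. The Gaussian weighting centred on the median perplexity, the use of the interquartile range as its width, and the synthetic log-normal perplexities are all assumptions made for illustration, not details taken from the abstract.

```python
import numpy as np

def gaussian_keep_prob(perplexity, center, width, scale=1.0):
    """Keep probability that peaks at `center` and decays for outlying perplexities."""
    return scale * np.exp(-0.5 * ((perplexity - center) / width) ** 2)

def perplexity_sample(perplexities, rng, scale=1.0):
    """Return indices of documents kept by perplexity-weighted subsampling."""
    p = np.asarray(perplexities, dtype=float)
    q1, median, q3 = np.percentile(p, [25, 50, 75])
    width = max(q3 - q1, 1e-12)  # assumed width: the interquartile range
    prob = gaussian_keep_prob(p, median, width, scale)
    keep = rng.random(p.shape) < prob
    return np.flatnonzero(keep)

rng = np.random.default_rng(0)
# Synthetic per-document perplexities (hypothetical; a real pipeline would
# score each document with a language model first).
ppl = rng.lognormal(mean=5.0, sigma=1.0, size=10_000)
idx = perplexity_sample(ppl, rng)
print(len(idx), "of", len(ppl), "documents kept")
```

Under these assumptions, documents with very low or very high perplexity are rarely kept, which shrinks the corpus while concentrating it around typical text.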


The State of Sparse Training in Deep Reinforcement Learning

Laura Graesser , Utku Evci , Erich Elsen , Pablo Samuel Castro

Categories: Machine Learning | Artificial Intelligence

2022-06-17

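The use of sparse neural networks has grown rapidly in recent years, particularly in computer vision. Their appeal stems largely from the reduction in the number of parameters required for training and storage, as well as from an increase in learning efficiency. Somewhat surprisingly, there have been very few efforts to explore their use in deep reinforcement learning (DRL). In this work, we perform a systematic investigation into applying a number of existing sparse training techniques to a variety of DRL agents and environments. Our results corroborate the findings from sparse training in the computer vision domain: in the DRL domain too, sparse networks perform better than dense networks for the same parameter count. We provide detailed analyses of how the various components in DRL are affected by the use of sparse networks, and conclude by suggesting promising avenues for improving the effectiveness of sparse training methods as well as for advancing their use in DRL.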


Machine learning approaches for COVID-19 detection from chest X-ray imaging: A Systematic Review

Harold Brayan Arteaga-Arteaga , Melissa delaPava , Alejandro Mora-Rubio , Mario Alejandro Bravo-Ortíz , Jesus Alejandro Alzate-Grisales , Daniel Arias-Garzón , Luis Humberto López-Murillo , Felipe Buitrago-Carmona , Juan Pablo Villa-Pulgarín , Esteban Mercado-Ruiz

Categories: Computer Vision | Machine Learning

2022-06-11

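There is a need to develop affordable and reliable diagnostic tools that help contain the spread of COVID-19. Machine learning (ML) algorithms have been proposed for designing decision-support systems that assess chest X-ray images, which have proven useful for detecting the disease and evaluating its progression. Many research articles have been published on this subject, which makes it difficult to identify the best approaches for future work. This paper presents a systematic review of ML applied to COVID-19 detection using chest X-ray images, aiming to offer researchers a baseline in terms of methods, architectures, databases, and current limitations.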


A Novel Partitioned Approach for Reduced Order Model -- Finite Element Model (ROM-FEM) and ROM-ROM Coupling

Amy de Castro , Paul Kuberry , Irina Tezaur , Pavel Bochev

Categories: Machine Learning

2022-06-09

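Partitioned methods allow one to build a simulation capability for coupled problems by reusing existing single-component codes. In so doing, partitioned methods can shorten code development and validation times for multiphysics and multiscale applications. In this work, we consider a scenario in which one or more of the "codes" being coupled are projection-based reduced order models (ROMs), introduced to lower the computational cost associated with a particular component. We simulate this scenario by considering a model interface problem that is discretized independently on two non-overlapping subdomains. We then formulate a partitioned scheme for this problem that allows a ROM "code" on one subdomain to be coupled with a finite element model (FEM) or ROM "code" on the other subdomain. The ROM "codes" are constructed by performing proper orthogonal decomposition (POD) on a snapshot ensemble to obtain a low-dimensional reduced order basis, followed by a Galerkin projection onto this basis. The ROM and/or FEM "codes" on each subdomain are then coupled using a Lagrange multiplier representing the interface flux. To partition the resulting monolithic problem, we first eliminate the flux through a dual Schur complement. Applying an explicit time integration scheme to the transformed monolithic problem decouples the subdomain equations, allowing them to be solved independently for the next time step. We show numerical results that demonstrate the proposed method's efficacy in achieving both ROM-FEM and ROM-ROM coupling.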
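The ROM construction described in this abstract (POD on a snapshot ensemble, then Galerkin projection onto the resulting basis) can be sketched in a few lines of NumPy. This is a generic illustration rather than the authors' code: the 1D diffusion-like operator, the explicit Euler snapshot generation, and the chosen rank are assumptions, and the interface coupling via Lagrange multipliers is not shown.

```python
import numpy as np

def pod_basis(snapshots, rank):
    """Leading `rank` left singular vectors of the snapshot matrix (the POD modes)."""
    # snapshots has shape (n_dof, n_snapshots); each column is one solution state.
    u, _, _ = np.linalg.svd(snapshots, full_matrices=False)
    return u[:, :rank]

def galerkin_project(operator, basis):
    """Galerkin projection of a full-order linear operator onto the reduced basis."""
    return basis.T @ operator @ basis

# Hypothetical full-order model: a 1D diffusion-like operator with n unknowns.
n = 50
A = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)

# Collect snapshots of dx/dt = A x with explicit Euler (assumed setup).
rng = np.random.default_rng(0)
x = rng.standard_normal(n)
dt = 0.1
columns = []
for _ in range(200):
    columns.append(x.copy())
    x = x + dt * (A @ x)
S = np.stack(columns, axis=1)

V = pod_basis(S, rank=5)        # reduced order basis, shape (50, 5)
A_r = galerkin_project(A, V)    # reduced operator, shape (5, 5)
print(V.shape, A_r.shape)
```

The reduced system then evolves coefficients `a` with `x ≈ V @ a`, so each time step works with a 5-by-5 operator instead of the 50-by-50 one.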


Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Aarohi Srivastava , Abhinav Rastogi , Abhishek Rao , Abu Awal Md Shoeb , Abubakar Abid , Adam Fisch , Adam R. Brown , Adam Santoro , Aditya Gupta , Adrià Garriga-Alonso

Categories: Natural Language Processing | Artificial Intelligence | Machine Learning | Machine Learning (Statistics)

2022-06-09

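Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-future capabilities and limitations of language models. To address this challenge, we introduce the Beyond the Imitation Game benchmark (BIG-bench). BIG-bench currently consists of 204 tasks, contributed by 442 authors across 132 institutions. Task topics are diverse, drawing problems from linguistics, childhood development, math, common-sense reasoning, biology, physics, social bias, software development, and beyond. BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models. We evaluate the behavior of OpenAI's GPT models, Google-internal dense transformer architectures, and Switch-style sparse transformers on BIG-bench, across model sizes spanning millions to hundreds of billions of parameters. In addition, a team of human expert raters performed all tasks in order to provide a strong baseline. Findings include: model performance and calibration both improve with scale, but are poor in absolute terms (and when compared with rater performance); performance is remarkably similar across model classes, though with benefits from sparsity; tasks that improve gradually and predictably commonly involve a large knowledge or memorization component, whereas tasks that exhibit "breakthrough" behavior at a critical scale often involve multiple steps or components, or brittle metrics; social bias typically increases with scale in settings with ambiguous context, but this can be improved with prompting.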
